- _______________________________________________________
- D/Noise 1.0d
- A Digital Audio Denoising Tool
- _______________________________________________________
- Windows 95 version
-
- (C) 1996 Fast Mathematical Algorithms and Hardware Corporation,
- 1020 Sherman Avenue, Hamden, CT 06514.
- <http://www.fmah.com>
-
-
- _______________________________________________________
- INTRODUCTION
- This demonstration version is meant to illustrate some of our current
- work in the area of audio signal processing. It is in no way suited for
- commercial denoising. This version will only operate on monophonic
- 16-bit WAV (Waveform Audio File Format) files. In addition, the input
- file is limited to one million sample points.
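
The input restrictions above can be checked before opening a file in D/Noise. The following is a minimal sketch using Python's standard wave module; the function name check_wav and the constant MAX_SAMPLES are hypothetical and are not part of D/Noise.

```python
import wave

MAX_SAMPLES = 1_000_000  # size limit stated for this demonstration version


def check_wav(path):
    """Return a list of reasons the file would be rejected (empty if none)."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            problems.append("file is not monophonic")
        if wav.getsampwidth() != 2:  # sample width is reported in bytes
            problems.append("samples are not 16-bit")
        if wav.getnframes() > MAX_SAMPLES:
            problems.append("file has more than one million sample points")
    return problems
```

An empty list means the file meets all three stated restrictions.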
-
- This version of D/Noise does not support algorithm iteration, i.e.,
- the denoising algorithm makes only a single pass through an audio file,
- separating it into what it thinks is coherent and what is noise. In the
- next release, you will be able to specify how many times the algorithm
- will pass through a file and its components to achieve a more thorough
- separation. In addition, you will be able to save a compressed version
- of the denoised file as well as apply some basic pre- and post-
- processing transforms.
-
- _______________________________________________________
- Installation
- ----------------
- This distribution consists of 4 files which should stay together in one
- folder:
-
- (1) Preliminary documentation (this file)
-
- (2) Dnoise.exe, the shell used to run the algorithm
-
- (3) Denoise.dll, the denoising algorithm in a dynamic link library
-
- (4) Caruso.wav, a sample audio file containing a snippet of Enrico
- Caruso's singing, recorded in 1904
-
-
- Opening a WAV file for denoising
- -------------------------------------------------
- D/Noise performs a one-pass denoising procedure on an open WAV
- file. To open a file:
- [1] Select "Open..." from the "File" menu.
- [2] Locate and open a file using the standard file dialog.
-
- You can run the denoising procedure on the entire file or just a short
- segment of it. To select a segment of your source file, click-drag across
- it with the mouse. The toolbar along the top of the main window has a
- couple of standard controls for scrolling the waveform representation of
- the file, as well as zooming in and out.
-
- The smallest length of the signal you can select is determined by the
- control at the right end of the toolbar at the bottom of the main
- window. This length also determines the size of the sliding signal
- window used in the denoising procedure. A length of 1,024 sample
- points is usually adequate. [NB: the other controls in the bottom
- toolbar are not functional yet and appear disabled.]
-
-
- Setting denoising parameters
- ----------------------------------------
- The outcome of the denoising procedure depends on the settings of
- various parameters. The exact meaning of these parameters is
- explained at the end of this document.
-
- To open the denoising algorithm interface, select "Configure..." from
- the "Denoise" menu.
-
- You can select one of two default parameter sets or enter your own.
- To select a default set, click on the "Default 1" or "Default 2" button in
- the "Parameters" frame.
-
- You can also set your own parameter values. You can use the [tab]
- key to jump from one box to the next. You will get an error message if
- you try to enter a value outside the range of a specific parameter.
-
-
- Running the denoising procedure
- -----------------------------------------------
- [1] Setting the output files
- The denoising process will leave your original input file untouched
- and generate two new files. The first of these two new files will
- contain the coherent ("clean") component of the source file and the
- second will contain the noisy component. In an extended procedure
- you could run the process on the noisy file again to extract even more
- coherent parts and add those to the first clean file. This version of
- D/Noise does not yet support this type of iteration (although you can
- do this "by hand"). In the next release, you will be able to specify a
- number of iterations for the algorithm.
-
- Use the Select... buttons to select names and locations for the two
- output files [Hint: if your files are fairly small and you have RAM to
- spare, you may want to put the output files on a RAM disk to speed up
- the process and minimize disk thrashing. Each of the two output files,
- coherent and noisy, needs the same amount of storage as your
- source file].
-
- [2] Starting the procedure
- Click the [Denoise All] or the [Denoise Selection] button at the
- bottom of the dialog box. The procedure starts and progress
- information is displayed. You can abort the procedure at any time by
- clicking the [Stop] button. Note that it may take a little while before
- the algorithm stops, as event polling is kept to a minimum in order not
- to slow down the process. When finished, close the dialog box by
- clicking the [Done] button. You can now open and see/hear the
- resulting coherent and noise files.
-
-
- _______________________________________________________
- About the D/Noise Algorithm and its Control Parameters
- by Maxim J. Goldberg and Igor Popovic
- _______________________________________________________
-
- INTRODUCTION
-
- The D/Noise family of algorithms was developed for the purpose of
- removing noise from one dimensional signals, in particular, speech or
- music signals, by the method of denoising proposed by R. Coifman
- and V. Wickerhauser. One starts with a library of orthonormal
- waveforms, which typically includes wavelet packets and local
- trigonometric bases. A signal is expanded in each basis, and a cost
- assigned to the expansion. The basis giving rise to the least cost is
- chosen, the coefficients are ordered by magnitude, and a number of the
- leading terms is kept as the coherent part based on a predetermined
- threshold cost of the remaining terms. These leftover terms constitute
- by definition the noisy part of the signal, and can be treated as a new
- signal which can in turn be expanded and separated into its coherent
- and noisy components.
-
- In D/Noise, we use only one library of bases, those arising from the
- dyadic decomposition tree obtained by constructing local sines on the
- frequencies of a smoothly cut window from the signal. A "best" basis
- is chosen by comparing the cost of a parent node to the sum of the
- costs of the 2 children. In D/Noise, the cost function can be chosen to
- be Shannon entropy or the lp norm of the coefficients of an expansion. We
- attempt to deal with numerical artifacts arising from the processing by
- (1) allowing shifts in time and frequency, and (2) by segmenting into
- large windows and only using the uncorrupted middle core. The large
- window we are using is 4 times the size of the core. For example, if
- the user selects a signal window of 1,024 samples, internally we slide
- and denoise a window of 4,096 samples and use only its 1,024-sample
- core in the reconstruction. This strategy has proven to give more
- pleasing results than any other "fancy" windowing.
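
The parent-versus-children comparison described above can be sketched as a bottom-up search over a dyadic tree. This is only an illustration of the general best-basis idea, not the code inside Denoise.dll. The node labelling (level, index), the dictionary of precomputed costs, and the function name best_basis are all assumptions made for the example.

```python
def best_basis(costs, node=(0, 0)):
    """Return (total_cost, chosen_nodes) for the subtree rooted at `node`.

    `costs` maps (level, index) to the cost of expanding the signal on
    that node. Node (l, i) has children (l + 1, 2 * i) and (l + 1, 2 * i + 1).
    """
    level, index = node
    left = (level + 1, 2 * index)
    right = (level + 1, 2 * index + 1)
    # A node without both children in the table is a leaf of the tree.
    if left not in costs or right not in costs:
        return costs[node], [node]
    left_cost, left_nodes = best_basis(costs, left)
    right_cost, right_nodes = best_basis(costs, right)
    # Keep the children only if together they are cheaper than the parent.
    if left_cost + right_cost < costs[node]:
        return left_cost + right_cost, left_nodes + right_nodes
    return costs[node], [node]


# Hypothetical costs for a three-level tree.
example_costs = {
    (0, 0): 10.0,
    (1, 0): 3.0, (1, 1): 5.0,
    (2, 0): 1.0, (2, 1): 1.5, (2, 2): 4.0, (2, 3): 2.0,
}
print(best_basis(example_costs))  # (7.5, [(2, 0), (2, 1), (1, 1)])
```

In the example, the left half is split further because 1.0 + 1.5 is less than 3.0, while the right half stays whole because 4.0 + 2.0 is not less than 5.0.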
-
-
- PARAMETERS
-
- (1) Window size
- This parameter determines the number of consecutive samples
- processed at one time. Internally, the algorithm slides two "windows"
- of the selected width through the signal, offset by 1/2 their width. In
- addition, each window is extended to both its sides and only the core is
- used in the reconstruction after denoising. The windows should not be
- too narrow, since good frequency resolution is desirable, in particular
- for music. Nor should the windows be too wide, since information
- spread over time might mask local occurrences. For music, window
- sizes of 512, 1024, or perhaps 2048 seem to be the ones to consider
- first.
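
The window arithmetic can be made concrete with a short sketch. It assumes that the extended window is four times the core (as stated earlier in this document), that the core sits exactly in the middle of the extended window, and that the two half-offset window sequences together amount to a step of half a core. The function name window_layout is hypothetical, and the treatment of the signal edges is not specified here, so the indices may fall outside the signal.

```python
def window_layout(n_samples, core=1024, factor=4):
    """List (core_start, core_end, ext_start, ext_end) for each position.

    Index ranges are half-open: a range covers start up to, but not
    including, end.
    """
    margin = (factor * core - core) // 2  # samples added on each side
    hop = core // 2                       # windows offset by half their width
    layout = []
    for core_start in range(0, n_samples, hop):
        core_end = core_start + core
        layout.append((core_start, core_end,
                       core_start - margin, core_end + margin))
    return layout
```

With a core of 1,024 samples the margin is 1,536 samples per side, so the first core, covering samples 0 to 1,023, is denoised inside an extended window running from index -1,536 up to index 2,559.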
-
- (2) Log2 of reach
- This is the log-base-2 of the size of the smallest interval to be
- considered by the local trigonometric transform decomposition tree.
- For example, the preset of 4 will give you 2^4=16 samples as the
- smallest interval to be considered.
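
As a small worked example, the interval sizes visited by a dyadic tree can be listed by repeated halving. Which length sits at the root of the tree is not stated in this document, so the argument root_size below is an assumption, as is the function name.

```python
def dyadic_interval_sizes(root_size, log2_reach):
    """List interval sizes from the root down to 2**log2_reach samples."""
    smallest = 1 << log2_reach  # e.g. log2_reach = 4 gives 16 samples
    sizes = []
    size = root_size
    while size >= smallest:
        sizes.append(size)
        size //= 2              # each level halves the interval
    return sizes


print(dyadic_interval_sizes(1024, 4))  # [1024, 512, 256, 128, 64, 32, 16]
```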
-
- (3) Energy threshold
- This is the energy threshold for discarding coefficients from the
- extracted signal basis. This number, typically .0001, or .000001,
- means that in the chosen basis, those coefficients of size less than
- (energy threshold) * (energy of window segment) are set to zero and
- thus discarded.
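
A minimal sketch of this thresholding step follows. It assumes that the "size" of a coefficient means its squared magnitude and that the energy of the window segment is the sum of the squared coefficients. Both are interpretations rather than documented behaviour, and the function name is hypothetical.

```python
def apply_energy_threshold(coeffs, threshold=1e-4):
    """Zero out coefficients whose energy is below threshold * total energy."""
    energy = sum(c * c for c in coeffs)  # assumed energy of the segment
    cutoff = threshold * energy
    return [c if c * c >= cutoff else 0.0 for c in coeffs]


print(apply_energy_threshold([10.0, 0.05, -3.0, 0.01]))
# [10.0, 0.0, -3.0, 0.0]
```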
-
- (4) Entropy
- A real number alpha, to determine which entropy function will be used
- to separate out the noise component: alpha = 0.0 is Shannon entropy, 0
- < alpha < 1 stands for little-l-sub-p norm, where p is 2*alpha. For
- example, entering 0.5 will result in l1 norm being used.
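
The two kinds of cost function selected by alpha can be sketched as follows. The sketch assumes the usual textbook forms: Shannon entropy of the normalized squared coefficients for alpha = 0, and the sum of |c|^p with p = 2*alpha otherwise. Whether D/Noise normalizes the coefficients in the same way is not stated here, and the function name cost is hypothetical.

```python
import math


def cost(coeffs, alpha=0.0):
    """Cost of an expansion: Shannon entropy (alpha = 0) or sum of |c|**p."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must satisfy 0 <= alpha < 1")
    if alpha == 0.0:
        energy = sum(c * c for c in coeffs)
        if energy == 0.0:
            return 0.0
        total = 0.0
        for c in coeffs:
            p_i = c * c / energy       # share of the energy in this term
            if p_i > 0.0:
                total -= p_i * math.log(p_i)
        return total
    p = 2.0 * alpha                    # alpha = 0.5 gives the l1 norm
    return sum(abs(c) ** p for c in coeffs)
```

Under these assumptions, an expansion that concentrates its energy in one coefficient has entropy 0, while four equal coefficients give log(4), so the concentrated expansion is the cheaper one.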
-
- (5) Entropy ratio
- This real number specifies the threshold mentioned in (3) above. A
- ratio of 1.0 or higher means that all the entries of expansion of the
- window segment will be considered to be coherent, while a ratio of 0.0
- or less means that the entire signal coming from each window will be
- considered to be noise. For music, a good testing entropy ratio may be
- between 0.3 and 0.4 if using Shannon entropy (alpha = 0.0 in (4)
- above); 0.7 works well for alpha = 0.5.
-
- (6) Time shift
- A specific value k means the signal is padded with k zeros in front,
- the whole program is run, and then the output files are shifted back to
- the left by k samples. The purpose of different shifts in time is to make
- the signal window cuts occur in different places. It is recommended
- that any shifts chosen be prime, or nearly prime, numbers without
- high powers of two in their factorization, and that each shift be
- less than one half of the window size set in (1) above.
- As mentioned previously, this version does not yet support any type of
- iteration. If you run the algorithm on the same file specifying a
- different time shift on each run, you will have to average the resulting
- files by hand, i.e. using some audio file mixing utility (a free utility
- package for AIFF files will be included with the next release).
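
The shift-and-average procedure described in (6) can be sketched as follows. The callable denoise stands in for a complete run of the algorithm and is purely hypothetical, as are the function names. The sketch works on plain lists of samples and leaves reading and writing WAV files aside.

```python
def denoise_with_shift(signal, k, denoise):
    """Pad with k zeros in front, run `denoise`, then shift back by k."""
    padded = [0] * k + list(signal)
    cleaned = denoise(padded)   # `denoise` is a stand-in for a full run
    return cleaned[k:]          # drop the k leading samples again


def average_runs(runs):
    """Average several equally long outputs sample by sample."""
    n = len(runs)
    return [sum(samples) / n for samples in zip(*runs)]
```

A hypothetical use would call denoise_with_shift once per chosen shift, for example with k = 3, 7 and 13, and pass the list of results to average_runs.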
-
- (7) Frequency shifts
- In this field you can enter up to 9 integer numbers, each specifying
- a shift in the frequency domain of the signal. As in (6), the purpose is
- to average out cutting artifacts from the spectrum when performing the
- adapted local trigonometric transform on the signal's frequencies.
- Small primes are recommended; the default presets should suffice.
-